A Misspelling Intelligent Analysis Approach for Correcting Misspelled Words in English Text

Authors

  • Guimin Huang
  • Yanzhou Huang
  • Yan Zhang
  • Ya Zhou
Abstract

This paper proposes an innovative MIA (Misspelling Intelligent Analysis) approach for efficient detection and intelligent correction of misspelled words. A complete spelling correction approach must handle both non-word errors and real-word errors. For non-word errors, the MIA approach ranks the suggestions for each misspelling using word frequency statistics, lexicon data, character distance and conditional probability. For real-word errors, it draws on context information to calculate an overall score or probability, which serves as the key criterion for correction. In particular, feature compensation and combination are provided to improve the accuracy of real-word error correction in articles written by Chinese students. Finally, experiments show that the MIA approach performs better at error detection, discrimination and correction than current methods of dealing with misspelled words.
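The abstract does not give the MIA scoring formula, so the following is only a minimal sketch of the general idea of ranking suggestions for a non-word error by combining character distance with word frequency. The function names, the toy frequency table `TOY_FREQ`, and the scoring rule (relative frequency divided by one plus the edit distance) are illustrative assumptions, not the authors' method. Context-based handling of real-word errors is not covered.

```python
from collections import Counter


def edit_distance(a: str, b: str) -> int:
    """Plain Levenshtein distance (insert, delete, substitute), row by row."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def rank_suggestions(word, freq, max_distance=2, top_n=3):
    """Rank lexicon entries as corrections for a possible non-word error.

    Hypothetical scoring rule (an assumption, not taken from the paper):
    relative frequency of the candidate divided by (1 + edit distance).
    """
    if word in freq:
        # The word is in the lexicon, so it is not a non-word error.
        return [word]
    total = sum(freq.values())
    scored = []
    for candidate, count in freq.items():
        distance = edit_distance(word, candidate)
        if distance <= max_distance:
            scored.append(((count / total) / (1 + distance), candidate))
    # Highest score first; ties broken alphabetically for determinism.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [candidate for _, candidate in scored[:top_n]]


# Toy lexicon with made-up counts, for illustration only.
TOY_FREQ = Counter({"their": 50, "there": 60, "the": 500,
                    "three": 20, "spelling": 10})

suggestions = rank_suggestions("speling", TOY_FREQ)
```

With this toy lexicon, "speling" is one insertion away from "spelling" and far from every other entry, so "spelling" is the only suggestion returned. A fuller system would presumably also weight candidates by a conditional probability of the observed error, as the abstract mentions.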

Similar resources

Non-word identification or spell checking without a dictionary

s is about 0.0015 while in titles it is near 0.0006. From 1980 on, the proportion of mistakes in abstracts is about 0.0002 while in titles it is less than 0.0001. As often as abstracts are reviewed for mistakes, the title of a paper is considered even more. The drop in the proportion of misspelled words in the early 1980s coincides with the widespread adoption of word processors and personal co...

Not the Word I Wanted? How Online English Learners' Dictionaries Deal with Misspelled Words

This study looks at how well the leading monolingual English learners’ dictionaries in their online versions cope with misspelled words as search terms. Six such dictionaries are tested on a corpus of misspellings produced by Polish, Japanese, and Finnish learners of English. The performance of the dictionaries varies widely, but is in general poor. For a large proportion of cases, dictionaries...

Material Development and English for Academic Purposes Word Lists; a Reductionist Approach

Nagy (1988) states that vocabulary is a prerequisite factor in comprehension. Drawing upon a reductionist approach and having in mind the prospects for material development, this study aimed at creating an English for Academic Purposes Word List (EAPWL). The corpus of this study was compiled from a corpus containing 6479 pages of texts, 2,081,678 million tokens (running words) and 63825 types (...

字形相似別字之自動校正方法 (Automatic Correction for Graphemic Chinese Misspelled Words) [In Chinese]

No matter that learning Chinese as a first or second language, a quite important issue, misspelled words, needs to be addressed. Many studies proposed that there was a suggestion of correcting misspelled words for students who are still schooling as well as a suggestion of teaching and learning strategies of Chinese characters for teachers. Although in schooling, it does to prevent students who...

Bilingual Random Walk Models for Automated Grammar Correction of ESL Author-Produced Text

We present a novel noisy channel model for correcting text produced by English as a second language (ESL) authors. We model the English word choices made by ESL authors as a random walk across an undirected bipartite dictionary graph composed of edges between English words and associated words in an author’s native language. We present two such models, using cascades of weighted finite-state tra...

Journal title:
  • JCIT

Volume 5, Issue 

Pages -

Publication date 2010